13 research outputs found

    Identification of differentially expressed genes by means of outlier detection

    Background: An important issue in microarray data analysis is to select, from thousands of genes, a small number of informative differentially expressed (DE) genes that may be key elements for a disease. If each gene is analyzed individually, there is a large number of hypotheses to test and a multiple comparison correction method must be used. Consequently, the resulting cut-off value may be too small. Moreover, the replicability of the DE gene selection is an important issue. We present a new method, called ORdensity, to obtain a reproducible selection of DE genes. Unlike the techniques usually applied to DE gene selection, it is not a gene-by-gene approach: it takes into account the relation between all genes. Results: The proposed method returns three measures, related to the concepts of outlier and density of false positives in a neighbourhood, which allow us to identify the DE genes with high classification accuracy. To assess the performance of ORdensity, we used simulated microarray data and four real microarray cancer data sets. The results indicated that the method correctly detects the DE genes; it is competitive with other well-accepted methods; the list of DE genes that it obtains is useful for the correct classification or diagnosis of new samples; and, in general, it is more stable than other procedures. Conclusions: ORdensity is a new method for identifying DE genes that avoids some of the shortcomings of individual gene identification, and it is stable when the original sample is replaced by subsamples.
    The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. This study was partially supported: II by the Spanish Ministerio de Economia y Competitividad (TIN2015-64395-R), by the Basque Government Research Team Grant (IT313-10), SAIOTEK Project SA-2013/00397, and by the University of the Basque Country UPV/EHU (Grant UFI11/45 (BAILab)); CA by the Spanish Ministerio de Economia y Competitividad (SAF2015-68341-R), by the Spanish Ministerio de Economia y Competitividad (TIN2015-64395-R) and by Grant 2014 SGR 464 (GRBIO) from the Departament d'Economia i Coneixement de la Generalitat de Catalunya. The funders had no role in the study design, data collection and interpretation, or the decision to submit the work for publication.
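
The abstract above notes that gene-by-gene testing forces a multiple comparison correction and that the resulting cut-off may be too small. As a rough illustration only, the sketch below uses the Bonferroni correction, which is one common choice and is an assumption here, since the abstract does not name a specific method. The function name and the numbers (10,000 genes, level 0.05) are hypothetical.

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction.

    Assumption: Bonferroni is used only as an example of a multiple
    comparison correction; the abstract does not specify one.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical numbers: 10,000 genes tested at a family-wise level of 0.05.
cutoff = bonferroni_cutoff(0.05, 10_000)
print(f"per-gene cutoff: {cutoff:.1e}")
```

With these illustrative values each gene would need a p-value below roughly 5e-6, which shows how quickly the threshold shrinks as the number of genes grows.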

    Informatika ingeniaritzan emakumezko eta gizonezko ikasleek duten jarreraren azterketa konparatiboa [Comparative analysis of the attitudes of female and male students in Computer Engineering]

    In recent decades the percentage of female students in the Computer Engineering degree at UPV/EHU has dropped markedly. This phenomenon is not limited to UPV/EHU: the data indicate that the same is happening worldwide. To study the gender gap in Computer Engineering, a final-degree project was carried out with two main goals. The first was to give students information so that they could reflect on this gap; the second was to better understand students' attitudes toward the gender gap. For the latter, students' opinions were collected through a survey and then analyzed statistically. The student Itziar Cortés carried out that final-degree project, and this paper reports the results obtained.

    Fuzzy classification with distance-based depth prototypes: High-dimensional unsupervised and/or supervised problems

    Supervised and unsupervised classification is crucial in many areas where different types of data sets are common, such as biology, medicine, or industry, among others. A key consideration is that some units are more typical of the group they belong to than others. For this reason, fuzzy classification approaches are necessary. In this paper, a fuzzy supervised classification method, which is based on the construction of prototypes, is proposed. The method obtains the prototypes from an objective function that includes label information and a distance-based depth function. It works with any distance and can deal with a wide variety of data sets. It can further be applied to data sets where the use of Euclidean distance is not suitable and to high-dimensional data (data sets in which the number of features p is larger than the number of observations n, often written as p ≫ n). In addition, the model can also cope with unsupervised classification, thus becoming an interesting alternative to other fuzzy clustering methods. With synthetic data sets along with high-dimensional real biomedical and industrial data sets, we demonstrate the good performance of the proposed supervised and unsupervised fuzzy procedures.
    This research was partially supported: II by the Spanish ‘Ministerio de Economia y Competitividad’ (PID2019-106942RB-C31). CA by grant 2021SGR01421 (GRBIO) from the Departament de Economia i Coneixement de la Generalitat de Catalunya, Spain. II, CA and BS by the Spanish ‘Ministerio de Economia y Competitividad’ (PID2021-122402OB-C21).

    ORdensity: user-friendly R package to identify differentially expressed genes

    Background: Microarray technology provides the expression level of many genes. Nowadays, an important issue is to select a small number of informative differentially expressed genes that provide biological knowledge and may be key elements for a disease. With the increasing volume of data generated by modern biomedical studies, software is required for effective identification of differentially expressed genes. Here, we describe an R package, called ORdensity, that implements a recent methodology (Irigoien and Arenas, 2018) developed in order to identify differentially expressed genes. The benefits of parallel implementation are discussed. Results: ORdensity gives the user the list of genes identified as differentially expressed in an easy and comprehensible way. Experiments carried out on an off-the-shelf computer with parallel execution enabled show an improvement in run-time. This implementation may also lead to substantial memory use. Results previously obtained with simulated and real data indicated that the procedure implemented in the package is robust and suitable for identifying differentially expressed genes. Conclusions: The new package, ORdensity, offers a friendly and easy way to identify differentially expressed genes, which is very useful for users not familiar with programming. Availability: https://github.com/rsait/ORdensity
    The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. This study was partially supported: II by the Spanish Ministerio de Economia y Competitividad (TIN2015-64395-R; PROSA-MED: TIN2016-77820-C3-1-R), by the Basque Government Research Team Grant (IT313-10), SAIOTEK Project SA-2013/00397, and by the University of the Basque Country UPV/EHU (Grant UFI11/45 (BAILab)); CA by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00), by the Spanish Ministerio de Economia y Competitividad (RTI2018-100968-B-I00) and by Grant 2017SGR622 (GRBIO) from the Departament d'Economia i Coneixement de la Generalitat de Catalunya. The funders had no role in the study design, data collection and interpretation, or the decision to submit the work for publication.

    K nearest neighbor equality: giving equal chance to all existing classes

    The nearest neighbor classification method assigns an unclassified point to the class of the nearest case in a set of previously classified points. This rule is independent of the underlying joint distribution of the sample points and their classifications. An extension of this approach is the k-NN method, in which the unclassified point is classified by a voting criterion among the k nearest points. The method we present here extends the k-NN idea: it searches each class for the k points nearest to the unclassified point, and assigns the point to the class that minimizes the mean distance between the unclassified point and those k nearest points within the class. As all classes can take part in the final selection process, we have called the new approach k Nearest Neighbor Equality (k-NNE). Our experimental results show the suitability of the k-NNE algorithm, and its effectiveness suggests that it could be added to the current list of distance-based classifiers.
    This work has been supported by the Basque Country University and by the Basque Government under the research team grant program.
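
The k-NNE rule described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `knne_predict` is hypothetical, Euclidean distance is assumed because the abstract does not fix a metric, and a class with fewer than k points simply uses all of its points.

```python
import math
from collections import defaultdict


def knne_predict(train_points, train_labels, query, k=3):
    """Classify `query` with the k-NNE rule as described in the abstract.

    For each class, take the k training points of that class closest to
    `query`, average their distances, and return the class with the
    smallest average. Euclidean distance is an assumption of this sketch.
    """
    # Collect the distance from the query to every training point, per class.
    dists_by_class = defaultdict(list)
    for point, label in zip(train_points, train_labels):
        dists_by_class[label].append(math.dist(point, query))

    best_label, best_mean = None, math.inf
    for label, dists in dists_by_class.items():
        # k nearest within this class (all points if the class has fewer than k).
        nearest = sorted(dists)[:k]
        mean_dist = sum(nearest) / len(nearest)
        if mean_dist < best_mean:
            best_label, best_mean = label, mean_dist
    return best_label


# Toy data: two well-separated classes in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knne_predict(points, labels, (1, 1), k=3))
print(knne_predict(points, labels, (4, 4), k=3))
```

Because every class contributes its own k nearest points, a class with few training points still competes on equal terms, whereas plain k-NN voting can be dominated by the most populous class.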

    Estatistika metodoak distantzietan oinarrituriko ikuspegitik [Statistical methods from a distance-based perspective]

    This paper reviews several distance-based methods that fall within multivariate analysis. The basic concepts are presented first, so that the essence of each method can then be explained briefly. Specifically, we cover methods that address regression, discriminant analysis, cluster analysis, typicality, and depth. This methodology has a notable feature: it can be applied to any type of data, without needing to know the distribution the data follow. Finally, to show the usefulness of these methods, they are applied to real functional data and the results obtained are described.

    Using Common Spatial Patterns to Select Relevant Pixels for Video Activity Recognition

    Itsaso Rodríguez-Moreno (corresponding author), José María Martínez-Otzeta, Basilio Sierra, Itziar Irigoien, Igor Rodriguez-Rodriguez and Izaro Goienetxea. Department of Computer Science and Artificial Intelligence, University of the Basque Country, Manuel Lardizabal 1, 20018 Donostia-San Sebastián, Spain. Appl. Sci. 2020, 10(22), 8075; https://doi.org/10.3390/app10228075. Received: 1 October 2020 / Revised: 30 October 2020 / Accepted: 11 November 2020 / Published: 14 November 2020. (This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ.)
    Abstract: Video activity recognition, despite being an emerging task, has attracted substantial research because of its everyday applications. Video camera surveillance could benefit greatly from advances in this field. In the area of robotics, the tasks of autonomous navigation or social interaction could also take advantage of the knowledge extracted from live video recording. In this paper, a new approach for video action recognition is presented. The technique takes a method commonly used in Brain Computer Interface (BCI) systems based on electroencephalography (EEG) and adapts it to this problem. After describing the technique, the results achieved are shown and a comparison with another method is carried out to analyze the performance of our new approach.
    This work has been partially funded by the Basque Government, Research Teams grant number IT900-16, ELKARTEK 3KIA project KK-2020/00049, and the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), and the European Regional Development Fund (FEDER), grant number RTI2018-093337-B-I100 (MCIU/AEI/FEDER, UE). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

    Proceedings of the 35th International Workshop on Statistical Modelling: July 20-24, 2020, Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of statistics among researchers, academics, and industrialists in a broad sense. Unfortunately, the global COVID-19 pandemic prevented the 35th edition of the IWSM from being held in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you the proceedings book of extended abstracts.
